1 Delaying physical register allocation through virtual-physical registers
Teresa Monreal, Antonio Gonzalez, Mateo Valero, Jose Gonzalez, Victor Vinals 
November 1999 Proceedings of the 32nd annual ACM/IEEE international symposium on 
Microarchitecture 

Full text available: pdf(865.88 KB)
Additional Information: full citation, abstract, references, citings, index terms
Publisher Site



Register file access time represents one of the critical delays of current microprocessors, and 
it is expected to become more critical as future processors increase the instruction window 
size and the issue width. This paper presents a novel physical register management scheme
that allows for a late allocation (at the end of execution) of registers. We show that it can
provide significant savings in the number of registers and thus can significantly shorten the
register file access t ... 

2 Register file and memory system design: Three-dimensional memory vectorization for
high bandwidth ...

Jesus Corbal, Roger Espasa, Mateo Valero 

November 2002 Proceedings of the 35th annual ACM/IEEE international symposium on 
Microarchitecture 

Full text available: pdf
Additional Information: full citation, abstract, references, index terms
Publisher Site

Vector processors have good performance, cost and adaptability when targeting multimedia 
applications. However, for a significant number of media programs, conventional memory 
configurations fail to deliver enough memory references per cycle to feed the SIMD 
functional units. This paper addresses the problem of the memory bandwidth. We propose a 
novel mechanism suitable for 2-dimensional vector architectures and targeted at providing 
high effective bandwidth for SIMD memory instructions. The basi ... 

3 Modulo scheduling: Modulo scheduling with integrated register spilling for clustered
VLIW architectures

Javier Zalamea, Josep Llosa, Eduard Ayguade, Mateo Valero 

December 2001 Proceedings of the 34th annual ACM/IEEE international symposium on 
Microarchitecture 

Full text available: pdf
Additional Information: full citation, abstract, references, citings
Publisher Site

Clustering is a technique to decentralize the design of future wide issue VLIW cores and
enable them to meet the technology constraints in terms of cycle time, area and power 
dissipation. In a clustered design, registers and functional units are grouped in clusters so 
that new instructions are needed to move data between them. New aggressive instruction 
scheduling techniques are required to minimize the negative effect of resource clustering and 
delays in moving data around. In this paper we pres ... 



4 Two-level hierarchical ...
Javier Zalamea, Josep Llosa, Eduard Ayguade, Mateo Valero 

December 2000 Proceedings of the 33rd annual ACM/IEEE international symposium on 
Microarchitecture 

Full text available: pdf(154.80 KB) ps(643.65 KB)
Additional Information: full citation, references, citings, index terms
Publisher Site



5 Multiple-banked register file architectures
Jose-Lorenzo Cruz, Antonio Gonzalez, Mateo Valero, Nigel P. Topham 

May 2000 ACM SIGARCH Computer Architecture News, Proceedings of the 27th annual
international symposium on Computer architecture, Volume 28 Issue 2

Full text available: pdf(106.23 KB) Additional Information: full citation, abstract, references, citings, index terms

The register file access time is one of the critical delays in current superscalar processors. Its 
impact on processor performance is likely to increase in future processor generations, as 
they are expected to increase the issue width (which implies more register ports) and the 
size of the instruction window (which implies more registers), and to use some kind of 
multithreading. Under this scenario, the register file access time could be a dominant delay 
and a pipelined implementation would ... 

Keywords: bypass logic, dynamically-scheduled processor, register file architecture, register 
file cache 



6 A performance study of out-of-order vector architectures and short registers 
Luis Villa, Roger Espasa, Mateo Valero

July 1998 Proceedings of the 12th international conference on Supercomputing 

Full text available: pdf(1.09 MB) Additional Information: full citation, references, index terms



7 A victim cache for vector registers 
Roger Espasa, Mateo Valero 

July 1997 Proceedings of the 11th international conference on Supercomputing 

Full text available: pdf(125 MB) Additional Information: full citation, references, citings, index terms



8 Heuristics for register ...
Josep Llosa, Mateo Valero, Eduard Ayguadé

December 1996 Proceedings of the 29th annual ACM/IEEE international symposium on 
Microarchitecture 

Full text available: pdf(1.44 MB) Additional Information: full citation, abstract, references, citings, index terms

Software Pipelining is a loop scheduling technique that extracts parallelism from loops by 
overlapping the execution of several consecutive iterations. There has been a significant 
effort to produce throughput-optimal schedules under resource constraints, and more
recently to produce throughput-optimal schedules with minimum register requirements. 
Unfortunately even a throughput-optimal schedule with minimum register requirements is 
useless if it requires more registers than those available in t ... 

9 A Content Aware Integer Register File Organization
Ruben Gonzalez, Adrian Cristal, Daniel Ortega, Alexander Veidenbaum, Mateo Valero
March 2004 ACM SIGARCH Computer Architecture News, Proceedings of the 31st
annual international symposium on Computer architecture - Volume 00,
Volume 32 Issue 2

Full text available: pdf(6.48 MB) Additional Information: full citation, abstract

A register file is a critical component of a modern superscalar processor. It has a large number
of entries and read/write ports in order to enable high levels of instruction parallelism. As a
result, the register file's area, access time, and energy consumption increase dramatically,
significantly affecting the overall superscalar processor's performance and
energy consumption. This is especially true in 64-bit processors. This paper presents a new
integer register file organization, which reduces energy co ...